CIEL: A Universal Execution Engine for Distributed Data-Flow Computing
Abstract
This paper introduces CIEL, a universal execution engine for distributed data-flow programs. Like previous execution engines, CIEL masks the complexity of distributed programming. Unlike those systems, a CIEL job can make data-dependent control-flow decisions, which enables it to compute iterative and recursive algorithms. We have also developed Skywriting, a Turing-complete scripting language that runs directly on CIEL. The execution engine provides transparent fault tolerance and distribution to Skywriting scripts and high-performance code written in other programming languages. We have deployed CIEL on a cloud computing platform, and demonstrate that it achieves scalable performance for both iterative and non-iterative algorithms.
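The idea of data-dependent control flow can be illustrated with a toy sketch. It is not CIEL's API and not Skywriting; `Spawn`, `run`, and `halve_until_small` are hypothetical names invented here, and the "engine" is a single-process loop with no distribution or fault tolerance. It only shows a task deciding, from the data it sees, whether the job continues with another task, which is what makes iteration expressible.

```python
from dataclasses import dataclass
from typing import Callable, Tuple, Union


@dataclass
class Spawn:
    """A task's request to continue the job with another task (hypothetical)."""
    func: Callable[..., "Result"]
    args: tuple


# A task either yields a final value or asks for a further task to run.
Result = Union[Spawn, float]


def run(task: Spawn) -> Tuple[float, int]:
    """Toy driver: keep executing tasks until one yields a plain value."""
    executed = 0
    current: Result = task
    while isinstance(current, Spawn):
        current = current.func(*current.args)
        executed += 1
    return current, executed


def halve_until_small(x: float, threshold: float) -> Result:
    """One iteration step; whether another task is spawned depends on the data."""
    if x <= threshold:
        return x  # converged: the job ends here
    return Spawn(halve_until_small, (x / 2, threshold))


value, tasks_executed = run(Spawn(halve_until_small, (40.0, 1.0)))
print(value, tasks_executed)
```

The number of tasks is not fixed when the job is submitted: it is determined at run time by the input value and the threshold.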
Related resources
Non-Deterministic Parallelism Considered Useful
The development of distributed execution engines has greatly simplified parallel programming, by shielding developers from the gory details of programming in a distributed system, and allowing them to focus on writing sequential code [8, 11, 18]. The “sacred cow” in these systems is transparent fault tolerance, which is achieved by dividing the computation into atomic tasks that execute determi...
Condensing the cloud: running CIEL on many-core
Distributed execution engines have revolutionised data processing by making parallel programming simple. Systems such as MapReduce [10], Dryad [13] and Hadoop [1] can achieve massive throughput when running on thousands of commodity servers, yet generally only require the programmer to provide sequential code. These systems were designed to scale out across many worker machines, each of which h...
An Effective Task Scheduling Framework for Cloud Computing using NSGA-II
Cloud computing is a model for convenient on-demand user’s access to changeable and configurable computing resources such as networks, servers, storage, applications, and services with minimal management of resources and service provider interaction. Task scheduling is regarded as a fundamental issue in cloud computing which aims at distributing the load on the different resources of a distribu...
A polyglot approach to cloud programming
Tools for programming distributed environments generally fall into one of two camps. Some, such as MapReduce or Dryad, insulate the programmer from many of the difficulties of distributed programming, at the expense of restricting the programming model. Others, such as MPI, provide a much more generic framework but force developers to consider all of the low-level details. We argue that this is...
Locality Aware Fair Scheduling for Hammr
Hammr is a distributed execution engine for data parallel applications modeled after Dryad. In this report, we present a locality aware fair scheduler for Hammr. We have developed functionality to support hierarchical scheduling, preemption and weighed users and a minimum flow based algorithm to maximize task preference. For evaluation, we’ve run Hammr on Hadoop Distributed File System on Amazo...
Publication date: 2011